Reduce initial pipeline load time by 4-5x (1/3) #149

Open
wants to merge 4 commits into base: main

Rypo
@Rypo Rypo commented Nov 28, 2024

Changes

  • Adds a low_cpu_mem_usage option to from_pretrained, i.e. from_pretrained(low_cpu_mem_usage=True), on OmniGen and OmniGenPipeline (akin to the transformers implementation, but greatly simplified).
  • Uses the accelerate init_empty_weights context manager when initializing the model. This avoids slow CPU weight initialization, particularly during self.initialize_weights().

These weights are immediately overwritten when the state_dict is loaded. This means we can safely bypass initialization without consequence.

Additionally, this can be achieved with no additional libraries beyond those in requirements.txt. As such, I set the default to low_cpu_mem_usage=True.

Results

From my tests, this change:

  • Reduces the initial pipeline load time by 4-5x and
  • Decreases peak initial RAM usage by 10-15GB

Cold Load

New process + memory freed

| low_cpu_mem_usage | avg load time | RAM usage |
|---|---|---|
| True | 9.53s | 18GB |
| False | 41.56s | 28GB |

Hot Load

pipe.from_pretrained...; del pipe; gc.collect(); pipe.from_pretrained...

| low_cpu_mem_usage | avg load time | RAM usage |
|---|---|---|
| True | 5.07s | 18GB |
| False | 36.64s | 33GB |
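For anyone wanting to reproduce the hot-load timings, a harness along these lines could be used. This is only a sketch: hot_load_times and dummy_load are hypothetical names, the dummy loader stands in for the real from_pretrained call, and RAM usage would have to be observed separately (for example with a system monitor).

```python
import gc
import time
from statistics import mean


def hot_load_times(load_fn, repeats=3):
    """Time repeated loads in one process after an untimed warm-up load.

    load_fn is any zero-argument callable that returns the loaded object.
    """
    # Warm-up load, mirroring the first from_pretrained call in the sequence.
    obj = load_fn()
    del obj
    gc.collect()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        obj = load_fn()
        timings.append(time.perf_counter() - start)
        # Free the object before the next run so each load starts clean.
        del obj
        gc.collect()
    return timings


def dummy_load():
    """Placeholder loader; swap in the real pipeline load to benchmark it."""
    return [0] * 100_000


timings = hot_load_times(dummy_load, repeats=3)
print(f"avg load time: {mean(timings):.4f}s over {len(timings)} runs")
```

Cold-load numbers need a fresh process per measurement, so the same timing code would be run once per process launch rather than in a loop.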

This is the first of 3 PRs I'm issuing to improve performance/fix errors. I've tried to keep each incremental change as small in scope as possible. PRs: 1. This, 2. #150, 3. #151

Rypo added 2 commits November 25, 2024 19:39
Prevents slow CPU initialization of model weights on load by using accelerate `init_empty_weights`.

Completely compatible with from_pretrained since weights will always be overwritten by state_dict

fixes VectorSpaceLab#72